Abstract

This paper proposes a framework dedicated to the construction of what we 
call time elastic inner products allowing one to embed sets of non-uniformly 
sampled multivariate time series of varying lengths into vector space struc- 
tures. This framework is based on a recursive definition that covers the case 
of multiple embedded time elastic dimensions. We prove that such inner 
products exist in our framework and show how a simple instance of this in- 
ner product class operates on some toy or prospective applications, while 
generalizing the Euclidean inner product. 

Keywords: Vector Space, Discrete Time Series, Sequence mining, Non 
Uniform Sampling, Elastic Inner Product, Time Warping 



1. Introduction 

Time series analysis in metric spaces has attracted much attention over
numerous decades and in various domains such as biology, statistics,
sociology, networking, signal processing, etc., essentially due to the
ubiquitous nature of time series, whether they are symbolic or numeric. Among
other characterizing tools, time warp distances (see [1], [2], and more
recently [3], [4] among other references) have shown some interesting
robustness compared to the Euclidean metric, especially when similarity
searching in time series databases is an issue. Unfortunately, this kind of
elastic distance does not enable direct construction of definite kernels,
which are useful when addressing regression, classification or clustering of
time series. A fortiori, they do not make it possible to directly construct
inner products involving some time elasticity, namely inner products able to
cope with some time stretching or some time compression. Recently, [5] have
shown that it is quite easy to propose inner products with time elasticity
capability, at least for some restricted time series spaces, basically spaces
containing uniformly sampled time series, all of which have the same length
(in such cases, time series can be embedded easily in Euclidean spaces).

The aim of this paper is to extend this preliminary work in order to
construct a time elastic inner product for a quasi-unrestricted set of time
series, i.e. sets for which the time series are not uniformly sampled and
have any lengths. Section two of the paper, following preliminary results
presented in [5], gives the main notations used throughout this paper and
presents a recursive construction for inner-like products. It then gives the
conditions and the proof of existence of time elastic inner products (and
time elastic vector spaces) defined on a quasi-unrestricted set of time
series while explaining what we mean by quasi-unrestricted. The third section
succinctly presents some applications, mainly to highlight some of the
features of Time Elastic Vector Spaces such as orthogonality.



2. Discrete Time Elastic Vector Spaces 

2.1. Sequence and sequence element 

Definition 2.1. Given a finite sequence $A$ we note $A(i)$ the $i^{th}$
element (symbol or sample) of sequence $A$. We will consider that
$A(i) \in S \times T$ where $(S, \oplus_S, \otimes_S)$ is a vector space that
embeds the multidimensional space variables (e.g. $S \subset \mathbb{R}^d$,
with $d \in \mathbb{N}^+$) and $T \subset \mathbb{R}$ embeds the timestamps
variable, so that we can write $A(i) = (a(i), t_{a(i)})$ where $a(i) \in S$
and $t_{a(i)} \in T$, with the condition that $t_{a(i)} > t_{a(j)}$ whenever
$i > j$ (timestamps strictly increase in the sequence of samples). $A_i^j$
with $i \le j$ is the subsequence consisting of the $i^{th}$ through the
$j^{th}$ element (inclusive) of $A$. So $A_i^j = A(i)A(i+1) \ldots A(j)$.
$\Lambda$ denotes the null element. By convention $A_i^j$ with $i > j$ is the
null time series, i.e. $\Omega$.

2.2. Sequence set 

Definition 2.2. The set of all finite discrete time series is thus embedded
in a spacetime characterized by a single discrete temporal dimension, that
encodes the timestamps, and any number of spatial dimensions that encode
the value of the time series at a given timestamp. We note
$\mathbb{U} = \{A_1^p \mid p \in \mathbb{N}\}$ the set of all finite discrete
time series. $A_1^p$ is a time series with discrete index varying between 1
and $p$. We note $\Omega$ the empty sequence (with null length) and by
convention $A_1^0 = \Omega$ so that $\Omega$ is a member of set $\mathbb{U}$.
$|A|$ denotes the length of the sequence $A$. Let
$\mathbb{U}_p = \{A \in \mathbb{U} \mid |A| \le p\}$ be the set of sequences
whose length is shorter or equal to $p$. Finally let $\mathbb{U}^*$ be the
set of discrete time series defined on $(S - \{0_S\}) \times T$, i.e. the set
of time series that do not contain the null spatial value. We denote by $0_S$
the null value in $S$.

2.3. Scalar multiplication on U* 

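Definition 2.3. For all $A \in \mathbb{U}^*$ and all $\lambda \in \mathbb{R}$,
$C = \lambda \otimes A \in \mathbb{U}^*$ is such that for all
$i \in \mathbb{N}$ such that $1 \le i \le |A|$,
$C(i) = (\lambda . a(i), t_{a(i)})$, and thus $|C| = |A|$.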

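2.4. Addition on U*

Definition 2.4. For all $(A, B) \in (\mathbb{U}^*)^2$ the addition of $A$ and
$B$, noted $C = A \oplus B \in \mathbb{U}^*$, is defined in a constructive
manner as follows. Let $i$, $j$ and $k$ be in $\mathbb{N}$.

$k = i = j = 1$,

As long as $1 \le i \le |A|$ and $1 \le j \le |B|$,
  if $t_{a_i} < t_{b_j}$, $C(k) = (a(i), t_{a_i})$ and $i \leftarrow i+1$, $k \leftarrow k+1$
  else if $t_{a_i} > t_{b_j}$, $C(k) = (b(j), t_{b_j})$ and $j \leftarrow j+1$, $k \leftarrow k+1$
  else if $a_i + b_j \ne 0$, $C(k) = (a(i) + b(j), t_{a_i})$ and $i \leftarrow i+1$, $j \leftarrow j+1$, $k \leftarrow k+1$
  else $i \leftarrow i+1$, $j \leftarrow j+1$

Once one of the two sequences is exhausted, the remaining elements of the
other one are appended to $C$ unchanged, so that $A \oplus \Omega = A$.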

Three comments need to be made at this level to clarify the semantics of
the operator $\oplus$:

i) Note that the $\oplus$ addition of two time series of equal lengths and
uniformly sampled coincides with the classical addition in vector spaces.
Fig. 1 gives an example of the addition of two time series that are not
uniformly sampled and that have different lengths.

ii) Implicitly (in light of the last case described in Def. 2.4), any
sequence element of the sort $(0_S, t)$, where $0_S$ is the null value in $S$
and $t \in T$, must be assimilated to the null sequence element $\Lambda$.
For instance, the addition of $A = (1, 1)(1, 2)$ with $B = (-1, 1)(1, 2)$ is
$C = A \oplus B = (2, 2)$: the addition of the two first sequence elements is
$(0, 1)$, which is assimilated to $\Lambda$ and as such suppressed in $C$.

Figure 1: The $\oplus$ binary operator when applied to two discrete time
series of variable lengths and not uniformly sampled. Co-occurring events
have been slightly separated at the top of the figure for readability
purposes.

iii) The $\oplus$ operator, when restricted to the set $\mathbb{U}^*$, is
reversible in that if $C = A \oplus B$ then $A = C \oplus ((-1) \otimes B)$
or $B = C \oplus ((-1) \otimes A)$. This is not the case if we consider the
entire set $\mathbb{U}$.
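
The two operators can be sketched in Python as follows. This is a minimal
sketch, assuming scalar sample values ($S = \mathbb{R}$) and time series
stored as lists of (value, timestamp) pairs; the names `ts_add` and
`ts_scale` are ours, and copying the leftover samples of the longer operand
once the other one is exhausted is how we read Def. 2.4.

```python
from typing import List, Tuple

Sample = Tuple[float, float]   # (value, timestamp)
Series = List[Sample]          # timestamps strictly increasing, values non-zero


def ts_add(A: Series, B: Series) -> Series:
    """Merge-style addition of two time series, following Def. 2.4."""
    C: Series = []
    i = j = 0
    while i < len(A) and j < len(B):
        (a, ta), (b, tb) = A[i], B[j]
        if ta < tb:                 # sample of A comes first
            C.append((a, ta))
            i += 1
        elif ta > tb:               # sample of B comes first
            C.append((b, tb))
            j += 1
        else:                       # co-occurring samples are summed
            if a + b != 0:          # a null sum is assimilated to Lambda and dropped
                C.append((a + b, ta))
            i += 1
            j += 1
    # Assumption: leftover samples of the longer operand are copied unchanged.
    C.extend(A[i:])
    C.extend(B[j:])
    return C


def ts_scale(lam: float, A: Series) -> Series:
    """Scalar multiplication of Def. 2.3 (lam is assumed to be non-zero)."""
    return [(lam * a, t) for a, t in A]


# Example of comment ii): the first two samples cancel out.
A = [(1, 1), (1, 2)]
B = [(-1, 1), (1, 2)]
C = ts_add(A, B)
```

With these inputs `C` holds the single sample `(2, 2)`, and adding
`ts_scale(-1, B)` to `C` gives back `A`, which illustrates the reversibility
stated in comment iii).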

2.5. Time elastic product (TEP) 

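Definition 2.5. A function
$\langle \cdot, \cdot \rangle : \mathbb{U}^* \times \mathbb{U}^* \to \mathbb{R}$
is called a Time Elastic Product if, for any pair of sequences $A_1^p$,
$B_1^q$, there exists a function $f : S^2 \to \mathbb{R}$, a non negative
symmetric function $g : T^2 \to \mathbb{R}^+$ and three constants $\alpha$,
$\beta$ and $\xi$ in $\mathbb{R}$ such that the following recursive equation
holds:

\[
\langle A_1^p, B_1^q \rangle_{tep} =
  \alpha \cdot \langle A_1^{p-1}, B_1^q \rangle_{tep}
  + \beta \cdot \langle A_1^{p-1}, B_1^{q-1} \rangle_{tep}
  + f(a(p), b(q)) \cdot g(t_{a(p)}, t_{b(q)})
  + \alpha \cdot \langle A_1^p, B_1^{q-1} \rangle_{tep}
\]

This recursive definition requires defining an initialization. To that end
we set, $\forall A \in \mathbb{U}^*$,
$\langle A, \Omega \rangle_{tep} = \langle \Omega, A \rangle_{tep} = \langle \Omega, \Omega \rangle_{tep} = \xi$,
where $\xi$ is a real constant (typically we set $\xi = 0$), and $\Omega$ is
the null sequence, with the convention that $A_i^j = \Omega$ whenever $i > j$.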
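
The recursion of Def. 2.5 can be evaluated bottom-up with a table indexed by
the prefix lengths. The following is a minimal sketch, assuming scalar sample
values, $f(a,b) = a \cdot b$ and $g(t,t') = e^{-\nu |t - t'|}$; these two
choices and the names `tep`, `f_prod` and `g_exp` are ours and only serve the
illustration.

```python
import math


def f_prod(a, b):
    """Assumed space function: plain product of scalar values."""
    return a * b


def g_exp(ta, tb, nu=1.0):
    """Assumed time function: non negative, symmetric, equal to 1 when ta == tb."""
    return math.exp(-nu * abs(ta - tb))


def tep(A, B, alpha, beta, xi=0.0, f=f_prod, g=g_exp):
    """Time Elastic Product of Def. 2.5 by dynamic programming.

    K[i][j] stores <A_1^i, B_1^j>; row 0 and column 0 hold the
    initialization value xi (products involving the null sequence).
    """
    p, q = len(A), len(B)
    K = [[xi] * (q + 1) for _ in range(p + 1)]
    for i in range(1, p + 1):
        a, ta = A[i - 1]
        for j in range(1, q + 1):
            b, tb = B[j - 1]
            K[i][j] = (alpha * K[i - 1][j]
                       + beta * K[i - 1][j - 1] + f(a, b) * g(ta, tb)
                       + alpha * K[i][j - 1])
    return K[p][q]


A = [(1.0, 0.0), (2.0, 0.5), (-1.0, 0.8)]
B = [(3.0, 0.25), (1.0, 0.5)]
value = tep(A, B, alpha=1.0, beta=-1.0)

# With alpha = 1, beta = -1 and xi = 0 the table is the 2-D prefix sum of
# f * g, so the product equals a double sum over all pairs of samples.
double_sum = sum(f_prod(a, b) * g_exp(ta, tb) for a, ta in A for b, tb in B)
```

The cost is $O(p \cdot q)$ evaluations of $f$ and $g$. For $\alpha = 1$,
$\beta = -1$ and $\xi = 0$, the recursion unrolls into
$\sum_{i \le p} \sum_{j \le q} f(a(i), b(j)) \cdot g(t_{a(i)}, t_{b(j)})$,
which is what `double_sum` computes.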



It has been shown in [5] that time elastic inner products can easily be
constructed from Def. 2.5 using the $\oplus$ and $\otimes$ operations when we
restrict the set of time series to some subset containing uniformly sampled
time series of equal lengths (in that case, the $\oplus$ coincides with the
classical addition on $S^p$). For instance, definitions 2.6 and 2.7
recursively define two TEP that are inner products on such restrictions.

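Definition 2.6.

\[
\langle A_1^p, B_1^q \rangle_{twip1} = \frac{1}{3} \cdot \Big(
  \langle A_1^{p-1}, B_1^q \rangle_{twip1}
  + \langle A_1^{p-1}, B_1^{q-1} \rangle_{twip1}
  + e^{-\nu \cdot d(t_{a(p)}, t_{b(q)})} (a(p) \cdot b(q))
  + \langle A_1^p, B_1^{q-1} \rangle_{twip1} \Big)
\]

where $d$ is a distance, and $\nu$ a time stiffness parameter.

Definition 2.7.

\[
\langle A_1^p, B_1^q \rangle_{twip2} = \frac{1}{1 + 2 \cdot e^{-\nu}} \cdot \Big(
  e^{-\nu} \cdot \langle A_1^{p-1}, B_1^q \rangle_{twip2}
  + \langle A_1^{p-1}, B_1^{q-1} \rangle_{twip2}
  + e^{-\nu \cdot d(t_{a(p)}, t_{b(q)})} (a(p) \cdot b(q))
  + e^{-\nu} \cdot \langle A_1^p, B_1^{q-1} \rangle_{twip2} \Big)
\]

where $d$ is a distance, and $\nu$ a time stiffness parameter.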



It can be shown that $\langle \cdot, \cdot \rangle_{twip2}$ coincides with the
Euclidean inner product on the considered restrictions of $\mathbb{U}$ when
$\nu \to \infty$.

This paper addresses the more interesting question of the existence of
similar elastic inner products on the set $\mathbb{U}^*$ itself, i.e. without
any restriction on the lengths of the considered time series nor the way they
are sampled. If the choice of functions $f$ and $g$, although constrained, is
potentially large, we show hereinafter that the choice for constants
$\alpha$, $\beta$ and $\xi$ is unique.

2.6. Existence of TEP inner products defined on U*

Theorem 2.1. $\langle \cdot, \cdot \rangle_{tep}$ is an inner product on
$(\mathbb{U}^*, \oplus, \otimes)$ iff:

i) $\xi = 0$,

ii) $h : (S \times T) \to \mathbb{R}$ defined as
$h((a, t_a)) = f(a, a) \cdot g(t_a, t_a)$ is strictly positive on
$((S - \{0_S\}) \times T)$,

iii) $f$ is an inner product on $(S, \oplus_S, \otimes_S)$, if we extend the
domain of $f$ on $S$ while setting $f(0_S, 0_S) = 0$,

iv) $\alpha = 1$ and $\beta = -1$.

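2.6.1. Proof of Theorem 2.1

Proof of the direct implication

Let us suppose first that $\langle \cdot, \cdot \rangle_{tep}$ is an inner
product defined on $\mathbb{U}^*$. Then $\langle \cdot, \cdot \rangle_{tep}$
is positive-definite, and thus $\langle \Omega, \Omega \rangle_{tep} = \xi = 0$.
Furthermore, for any $A = (a, t_a) \in \mathbb{U}^*$,
$\langle A, A \rangle_{tep} = h((a, t_a)) > 0$. Thus i) and ii) are satisfied.
As $g$ is non-negative, if we set $f(0_S, 0_S) = 0$, $f$ is positive-definite
on $S$.

It is also straightforward to show that $f$ is symmetric if $g$ and
$\langle \cdot, \cdot \rangle_{tep}$ are symmetric.

Since $\xi = 0$, for any $A$, $B$ and $C \in \mathbb{U}^*$ such that
$A = (a, t)$, $B = (b, t)$ and $C = (c, t_c)$, we have:

\[ \langle A \oplus B, C \rangle_{tep} = f(a \oplus_S b, c) \cdot g(t, t_c). \]

As

\[
\langle A \oplus B, C \rangle_{tep}
  = \langle A, C \rangle_{tep} + \langle B, C \rangle_{tep}
  = f(a, c) \cdot g(t, t_c) + f(b, c) \cdot g(t, t_c)
  = (f(a, c) + f(b, c)) \cdot g(t, t_c),
\]

and as $g$ is non negative (taking $t_c = t$, for which ii) gives
$g(t, t) > 0$), we get that $f(a \oplus_S b, c) = f(a, c) + f(b, c)$.

Furthermore,
$\langle \lambda \otimes A, C \rangle_{tep} = f(\lambda \otimes_S a, c) \cdot g(t, t_c)$.
As
$\langle \lambda \otimes A, C \rangle_{tep} = \lambda \cdot \langle A, C \rangle_{tep} = \lambda \cdot f(a, c) \cdot g(t, t_c)$
and $g$ is non negative, we get that
$f(\lambda \otimes_S a, c) = \lambda \cdot f(a, c)$.

This shows that $f$ is linear, symmetric and positive-definite. Hence it is
an inner product on $(S, \oplus_S, \otimes_S)$ and iii) is satisfied.

Let us show that necessarily $\alpha = 1$ and $\beta = -1$. To that end, let
us consider any $A_1^p$, $B_1^q$ and $C_1^r$ in $\mathbb{U}^*$, such that
$p > 1$, $q > 1$, $r > 1$ and such that $t_{a_p} < t_{b_q}$, i.e. if
$X_1^s = A_1^p \oplus B_1^q$ then $X_1^{s-1} = A_1^p \oplus B_1^{q-1}$.
Since by hypothesis $\langle \cdot, \cdot \rangle_{tep}$ is an inner product
on $(\mathbb{U}^*, \oplus, \otimes)$, it is linear and thus we can write:

\[
\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep}
  = \langle A_1^p, C_1^r \rangle_{tep} + \langle B_1^q, C_1^r \rangle_{tep}.
\]

Decomposing $\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep}$, we obtain:

\[
\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep}
  = \alpha \cdot \langle A_1^p \oplus B_1^{q-1}, C_1^r \rangle_{tep}
  + \beta \cdot \langle A_1^p \oplus B_1^{q-1}, C_1^{r-1} \rangle_{tep}
  + f(b_q, c_r) \cdot g(t_{b_q}, t_{c_r})
  + \alpha \cdot \langle A_1^p \oplus B_1^q, C_1^{r-1} \rangle_{tep}
\]

As $\langle \cdot, \cdot \rangle_{tep}$ is linear we get:

\[
\begin{aligned}
\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep}
  ={}& \alpha \cdot \langle A_1^p, C_1^r \rangle_{tep}
     + \alpha \cdot \langle B_1^{q-1}, C_1^r \rangle_{tep} \\
   & + \beta \cdot \langle A_1^p, C_1^{r-1} \rangle_{tep}
     + \beta \cdot \langle B_1^{q-1}, C_1^{r-1} \rangle_{tep}
     + f(b_q, c_r) \cdot g(t_{b_q}, t_{c_r}) \\
   & + \alpha \cdot \langle A_1^p, C_1^{r-1} \rangle_{tep}
     + \alpha \cdot \langle B_1^q, C_1^{r-1} \rangle_{tep}
\end{aligned}
\]

Hence, recognizing the decomposition of
$\langle B_1^q, C_1^r \rangle_{tep}$,

\[
\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep}
  = \alpha \cdot \langle A_1^p, C_1^r \rangle_{tep}
  + \beta \cdot \langle A_1^p, C_1^{r-1} \rangle_{tep}
  + \alpha \cdot \langle A_1^p, C_1^{r-1} \rangle_{tep}
  + \langle B_1^q, C_1^r \rangle_{tep}
\]

If we decompose $\langle A_1^p, C_1^r \rangle_{tep}$, we get:

\[
\begin{aligned}
\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep}
  ={}& (\alpha^2 + \beta + \alpha) \cdot \langle A_1^p, C_1^{r-1} \rangle_{tep}
     + \alpha \cdot \beta \cdot \langle A_1^{p-1}, C_1^{r-1} \rangle_{tep} \\
   & + \alpha \cdot f(a_p, c_r) \cdot g(t_{a_p}, t_{c_r})
     + \alpha^2 \cdot \langle A_1^{p-1}, C_1^r \rangle_{tep}
     + \langle B_1^q, C_1^r \rangle_{tep}
\end{aligned}
\]

Thus we have to identify

\[
\langle A_1^p, C_1^r \rangle_{tep}
  = \alpha \cdot \langle A_1^p, C_1^{r-1} \rangle_{tep}
  + \beta \cdot \langle A_1^{p-1}, C_1^{r-1} \rangle_{tep}
  + f(a_p, c_r) \cdot g(t_{a_p}, t_{c_r})
  + \alpha \cdot \langle A_1^{p-1}, C_1^r \rangle_{tep}
\]

with

\[
(\alpha^2 + \beta + \alpha) \cdot \langle A_1^p, C_1^{r-1} \rangle_{tep}
  + \alpha \cdot \beta \cdot \langle A_1^{p-1}, C_1^{r-1} \rangle_{tep}
  + \alpha \cdot f(a_p, c_r) \cdot g(t_{a_p}, t_{c_r})
  + \alpha^2 \cdot \langle A_1^{p-1}, C_1^r \rangle_{tep}.
\]

The unique solution is $\alpha = 1$ and $\beta = -1$. That is, if
$\langle \cdot, \cdot \rangle_{tep}$ is an existing inner product, then
necessarily $\alpha = 1$ and $\beta = -1$, establishing iv).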
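
The identification can be made explicit. As a sketch, assuming that the four
quantities appearing in both expressions can be treated as independent,
matching their coefficients one by one gives the following system:

```latex
\begin{align*}
\text{coefficient of } \langle A_1^{p}, C_1^{r-1} \rangle_{tep} &: \quad \alpha = \alpha^2 + \beta + \alpha \\
\text{coefficient of } \langle A_1^{p-1}, C_1^{r-1} \rangle_{tep} &: \quad \beta = \alpha \cdot \beta \\
\text{coefficient of } f(a_p, c_r) \cdot g(t_{a_p}, t_{c_r}) &: \quad 1 = \alpha \\
\text{coefficient of } \langle A_1^{p-1}, C_1^{r} \rangle_{tep} &: \quad \alpha = \alpha^2
\end{align*}
```

The third line forces $\alpha = 1$. The first line then reduces to
$\alpha^2 + \beta = 0$, i.e. $\beta = -1$, and the two remaining lines hold
for these values.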

Proof of the converse implication

Let us suppose that i), ii), iii) and iv) are satisfied and show that
$\langle \cdot, \cdot \rangle_{tep}$ is an inner product on $\mathbb{U}^*$.

First, by construction, since $f$ and $g$ are symmetric, so is
$\langle \cdot, \cdot \rangle_{tep}$.

It is easy to show by induction that $\langle \cdot, \cdot \rangle_{tep}$ is
non-decreasing with the length of its arguments, namely, $\forall A_1^p$ and
$B_1^q$ in $\mathbb{U}^*$,
$\langle A_1^p, B_1^q \rangle_{tep} - \langle A_1^p, B_1^{q-1} \rangle_{tep} \ge 0$.
Let $n = p + q$. The proposition is true at rank $n = 0$. It is also true if
$A_1^p = \Omega$, whatever $B_1^q$ is, or $B_1^q = \Omega$, whatever $A_1^p$
is. Suppose it is true at a rank $n \ge 0$, and consider $A_1^p \ne \Omega$
and $B_1^q \ne \Omega$ such that $p + q = n + 1$.

By decomposing $\langle A_1^p, B_1^q \rangle_{tep}$ we get:

\[
\langle A_1^p, B_1^q \rangle_{tep} - \langle A_1^p, B_1^{q-1} \rangle_{tep}
  = - \langle A_1^{p-1}, B_1^{q-1} \rangle_{tep}
  + f(a_p, b_q) \cdot g(t_{a_p}, t_{b_q})
  + \langle A_1^{p-1}, B_1^q \rangle_{tep}
\]

Since $f(a_p, b_q) \cdot g(t_{a_p}, t_{b_q}) \ge 0$ and the proposition is
true by inductive hypothesis at rank $n$, we get that
$\langle A_1^p, B_1^q \rangle_{tep} - \langle A_1^p, B_1^{q-1} \rangle_{tep} \ge 0$.
By induction the proposition is proved.

Let us show by induction on the length of the time series the positive
definiteness of $\langle \cdot, \cdot \rangle_{tep}$.

At rank 0 we have $\langle \Omega, \Omega \rangle_{tep} = \xi = 0$. At rank 1,
let us consider any time series of length 1, $A_1^1$:
$\langle A_1^1, A_1^1 \rangle_{tep} = f(a_1, a_1) \cdot g(t_{a_1}, t_{a_1}) > 0$
by hypothesis on $f$ and $g$. Let us suppose that the proposition is true at
rank $n \ge 1$ and let us consider any time series of length $n + 1$,
$A_1^{n+1}$. Then, since $\alpha = 1$ and $\beta = -1$,

\[
\langle A_1^{n+1}, A_1^{n+1} \rangle_{tep}
  = 2 \cdot \langle A_1^{n+1}, A_1^n \rangle_{tep}
  - \langle A_1^n, A_1^n \rangle_{tep}
  + f(a_{n+1}, a_{n+1}) \cdot g(t_{a_{n+1}}, t_{a_{n+1}}).
\]

Since
$\langle A_1^{n+1}, A_1^n \rangle_{tep} - \langle A_1^n, A_1^n \rangle_{tep} \ge 0$
and $h(A(n+1)) > 0$, we have
$\langle A_1^{n+1}, A_1^{n+1} \rangle_{tep} > 0$, showing that the proposition
is true at rank $n + 1$. By induction, the proposition is proved, which
establishes the positive-definiteness of $\langle \cdot, \cdot \rangle_{tep}$
since $\langle A_1^p, A_1^p \rangle_{tep} = 0$ only if $A_1^p = \Omega$.

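Let us consider any $\lambda \in \mathbb{R}$, and any $A_1^p$, $B_1^q$ in
$\mathbb{U}^*$ and show by induction on $n = p + q$ that
$\langle \lambda \otimes A_1^p, B_1^q \rangle_{tep} = \lambda \cdot \langle A_1^p, B_1^q \rangle_{tep}$.
The proposition is true at rank $n = 0$. Let us suppose that the proposition
is true at rank $n \ge 0$, i.e. for all $r \le n$, and consider any pair
$A_1^p$, $B_1^q$ of time series such that $p + q = n + 1$.

We have:

\[
\langle \lambda \otimes A_1^p, B_1^q \rangle_{tep}
  = \alpha \cdot \langle \lambda \otimes A_1^p, B_1^{q-1} \rangle_{tep}
  + \beta \cdot \langle \lambda \otimes A_1^{p-1}, B_1^{q-1} \rangle_{tep}
  + f(\lambda \otimes_S a_p, b_q) \cdot g(t_{a_p}, t_{b_q})
  + \alpha \cdot \langle \lambda \otimes A_1^{p-1}, B_1^q \rangle_{tep}
\]

Since $f$ is linear on $(S, \oplus_S, \otimes_S)$, and since the proposition
is true by hypothesis at rank $n$, we get that

\[
\langle \lambda \otimes A_1^p, B_1^q \rangle_{tep}
  = \lambda \cdot \alpha \cdot \langle A_1^p, B_1^{q-1} \rangle_{tep}
  + \lambda \cdot \beta \cdot \langle A_1^{p-1}, B_1^{q-1} \rangle_{tep}
  + \lambda \cdot f(a_p, b_q) \cdot g(t_{a_p}, t_{b_q})
  + \lambda \cdot \alpha \cdot \langle A_1^{p-1}, B_1^q \rangle_{tep}
  = \lambda \cdot \langle A_1^p, B_1^q \rangle_{tep}.
\]

By induction, the proposition is true for any $n$, and we have proved this
proposition.

Furthermore, for any $A_1^p$, $B_1^q$ and $C_1^r$ in $\mathbb{U}^*$, let us
show by induction on $n = p + q + r$ that
$\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep} = \langle A_1^p, C_1^r \rangle_{tep} + \langle B_1^q, C_1^r \rangle_{tep}$.
Let $X_1^s$ be equal to $A_1^p \oplus B_1^q$. The proposition is obviously
true at rank $n = 0$. Let us suppose that it is true up to rank $n \ge 0$,
and consider any $A_1^p$, $B_1^q$ and $C_1^r$ such that
$p + q + r = n + 1$.

Three cases need then to be considered:

1) if $X_1^{s-1} = A_1^{p-1} \oplus B_1^{q-1}$, then
$t_{a_p} = t_{b_q} = t$ and
$\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep} = \alpha \cdot \langle A_1^p \oplus B_1^q, C_1^{r-1} \rangle_{tep} + \beta \cdot \langle A_1^{p-1} \oplus B_1^{q-1}, C_1^{r-1} \rangle_{tep} + f((a_p + b_q), c_r) \cdot g(t, t_{c_r}) + \alpha \cdot \langle A_1^{p-1} \oplus B_1^{q-1}, C_1^r \rangle_{tep}$.
Since $f$ is linear on $(S, \oplus_S, \otimes_S)$, and the proposition is
true at rank $n$, we get the result.

2) if $X_1^{s-1} = A_1^p \oplus B_1^{q-1}$, then $t_{a_p} < t_{b_q} = t$ and
$\langle A_1^p \oplus B_1^q, C_1^r \rangle_{tep} = \alpha \cdot \langle A_1^p \oplus B_1^q, C_1^{r-1} \rangle_{tep} + \beta \cdot \langle A_1^p \oplus B_1^{q-1}, C_1^{r-1} \rangle_{tep} + f(b_q, c_r) \cdot g(t, t_{c_r}) + \alpha \cdot \langle A_1^p \oplus B_1^{q-1}, C_1^r \rangle_{tep}$.
Having $\alpha = 1$ and $\beta = -1$ with the proposition supposed to be true
at rank $n$ we get the result.

3) if $X_1^{s-1} = A_1^{p-1} \oplus B_1^q$, we proceed similarly to case 2).

Thus the proposition is true at rank $n + 1$, and by induction the
proposition is true for all $n$. This establishes the linearity of
$\langle \cdot, \cdot \rangle_{tep}$.
This ends the proof of the converse implication and Theorem 2.1 is therefore
established. □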
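
The role of condition iv) can be checked numerically on a very small case.
The sketch below is only an illustration, assuming scalar values,
$f(a,b) = a \cdot b$ and $g(t,t') = e^{-\nu|t-t'|}$ with $\nu = 1$; the helper
names `ts_add`, `tep` and `additivity_gap` are ours. It measures how far
$\langle A \oplus B, C \rangle_{tep}$ is from
$\langle A, C \rangle_{tep} + \langle B, C \rangle_{tep}$ for a given pair
$(\alpha, \beta)$.

```python
import math


def ts_add(A, B):
    """Addition of Def. 2.4 on lists of (value, timestamp) pairs."""
    C, i, j = [], 0, 0
    while i < len(A) and j < len(B):
        (a, ta), (b, tb) = A[i], B[j]
        if ta < tb:
            C.append((a, ta)); i += 1
        elif ta > tb:
            C.append((b, tb)); j += 1
        else:
            if a + b != 0:
                C.append((a + b, ta))
            i += 1; j += 1
    return C + A[i:] + B[j:]   # assumption: leftovers are copied unchanged


def tep(A, B, alpha, beta, nu=1.0):
    """Recursion of Def. 2.5 with xi = 0, f = product, g = exp(-nu * |dt|)."""
    K = [[0.0] * (len(B) + 1) for _ in range(len(A) + 1)]
    for i, (a, ta) in enumerate(A, start=1):
        for j, (b, tb) in enumerate(B, start=1):
            K[i][j] = (alpha * K[i - 1][j] + beta * K[i - 1][j - 1]
                       + a * b * math.exp(-nu * abs(ta - tb))
                       + alpha * K[i][j - 1])
    return K[len(A)][len(B)]


A = [(1.0, 0.0)]
B = [(1.0, 1.0)]
C = [(1.0, 0.0)]


def additivity_gap(alpha, beta):
    """|<A + B, C> - (<A, C> + <B, C>)| for the three series above."""
    lhs = tep(ts_add(A, B), C, alpha, beta)
    rhs = tep(A, C, alpha, beta) + tep(B, C, alpha, beta)
    return abs(lhs - rhs)


gap_teip = additivity_gap(1.0, -1.0)           # constants required by iv)
gap_other = additivity_gap(1.0 / 3, 1.0 / 3)   # another choice of constants
```

On this example the left-hand side equals $\alpha + e^{-1}$ while the
right-hand side equals $1 + e^{-1}$, so the gap vanishes for $\alpha = 1$ and
equals $2/3$ for $\alpha = 1/3$.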

The existence of functions $f$ and $g$ entering into the definition of
$\langle \cdot, \cdot \rangle_{tep}$ and satisfying the conditions allowing
for the construction of an inner product on $(\mathbb{U}^*, \oplus, \otimes)$
is ensured by the following proposition:

Proposition 2.2. The functions $f : S^2 \to \mathbb{R}$ defined as
$f(a, b) = \langle a, b \rangle_S$, where $\langle \cdot, \cdot \rangle_S$ is
an inner product on $(S, \oplus_S, \otimes_S)$, and
$g : T^2 \to \mathbb{R}$ defined as
$g(t_a, t_b) = e^{-\nu \cdot d(t_a, t_b)}$, where $d$ is a distance defined on
$T^2$ and $\nu \in \mathbb{R}^+$, satisfy the conditions required to
construct an elastic inner product on $(\mathbb{U}^*, \oplus, \otimes)$.

The proof of Prop. 2.2 is obvious. This proposition establishes the
existence of TEP inner products, that we will denote TEIP (Time Elastic Inner
Product). Note that $\langle \cdot, \cdot \rangle_S$ can be chosen to be a
TEIP as well, in the case where a second time elastic dimension is required.
This leads naturally to recursive definitions for TEP and TEIP.

Proposition 2.3. For any $n \in \mathbb{N}$, and any discrete subset
$T = \{t_1, t_2, \cdots, t_n\} \subset \mathbb{R}$, let
$\mathbb{U}_{n,\mathbb{R},T}$ be the set of all time series defined on
$\mathbb{R} \times T$ whose lengths are $n$ (the time series in
$\mathbb{U}_{n,\mathbb{R},T}$ are considered to be uniformly sampled). Then,
the TEIP on $\mathbb{U}_{n,\mathbb{R},T}$ constructed from the functions $f$
and $g$ defined in Prop. 2.2 tends towards the Euclidean inner product when
$\nu \to \infty$ if $S$ is a Euclidean space and
$\langle a, b \rangle_S$ is the Euclidean inner product defined on $S$.

The proof of Prop. 2.3 is straightforward and is omitted. Prop. 2.3 shows
that a TEIP generalizes the classical Euclidean inner product.
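
Prop. 2.3 can be illustrated numerically. The sketch below assumes scalar
values, $d(t,t') = |t - t'|$ and the double-sum form that the TEP recursion
takes for $\alpha = 1$, $\beta = -1$, $\xi = 0$; the names `teip` and
`euclid` and the sample values are ours.

```python
import math


def teip(A, B, nu):
    """TEIP as a double sum of a(i) * b(j) * exp(-nu * |t_a(i) - t_b(j)|)."""
    return sum(a * b * math.exp(-nu * abs(ta - tb))
               for a, ta in A for b, tb in B)


def euclid(A, B):
    """Euclidean inner product of two equal-length series (timestamps ignored)."""
    return sum(a * b for (a, _), (b, _) in zip(A, B))


n = 5
T = [i / (n - 1) for i in range(n)]            # shared, uniform timestamps
A = list(zip([1.0, -2.0, 0.5, 3.0, 1.5], T))
B = list(zip([2.0, 1.0, -1.0, 0.5, 4.0], T))

stiff = teip(A, B, nu=1e4)   # very stiff time axis: only equal timestamps match
loose = teip(A, B, nu=0.0)   # no stiffness: every pair of samples counts fully
```

For a very large $\nu$ the off-diagonal weights vanish and `stiff` matches
`euclid(A, B)`; for $\nu = 0$ all weights equal 1 and `loose` is the product
of the two sums of values.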



3. Some applications 

We present in the following sections some applications to highlight the 
properties of Time Elastic Vector Spaces (TEVS). 

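3.1. Distance in TEVS

The following proposition provides $\mathbb{U}^*$ with a norm and a distance,
both induced by a TEIP.

Proposition 3.1. For all $A_1^p \in \mathbb{U}^*$, and any TEIP
$\langle \cdot, \cdot \rangle$ defined on $(\mathbb{U}^*, \oplus, \otimes)$,
$\sqrt{\langle A_1^p, A_1^p \rangle}$ is a norm on $\mathbb{U}^*$.
For all pairs $(A_1^p, B_1^q) \in (\mathbb{U}^*)^2$ and any TEIP defined on
$(\mathbb{U}^*, \oplus, \otimes)$,
$\delta(A_1^p, B_1^q) = \sqrt{\langle A_1^p \oplus (-1 \otimes B_1^q), A_1^p \oplus (-1 \otimes B_1^q) \rangle}$
defines a distance metric on $\mathbb{U}^*$.

The proof of Prop. 3.1 is straightforward and is omitted.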
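
The norm and the distance of Prop. 3.1 can be sketched as follows, assuming
scalar values, $g(t,t') = e^{-\nu|t-t'|}$ and the double-sum form of the TEIP
(case $\alpha = 1$, $\beta = -1$, $\xi = 0$). All function names are ours,
and the clamp to zero before the square root is only a guard against
floating-point round-off.

```python
import math


def ts_add(A, B):
    """Addition of Def. 2.4 on lists of (value, timestamp) pairs."""
    C, i, j = [], 0, 0
    while i < len(A) and j < len(B):
        (a, ta), (b, tb) = A[i], B[j]
        if ta < tb:
            C.append((a, ta)); i += 1
        elif ta > tb:
            C.append((b, tb)); j += 1
        else:
            if a + b != 0:
                C.append((a + b, ta))
            i += 1; j += 1
    return C + A[i:] + B[j:]   # assumption: leftovers are copied unchanged


def ts_scale(lam, A):
    """Scalar multiplication of Def. 2.3."""
    return [(lam * a, t) for a, t in A]


def teip(A, B, nu):
    """TEIP as a double sum of a * b * exp(-nu * |ta - tb|)."""
    return sum(a * b * math.exp(-nu * abs(ta - tb))
               for a, ta in A for b, tb in B)


def teip_norm(A, nu):
    """Norm induced by the TEIP."""
    return math.sqrt(max(0.0, teip(A, A, nu)))


def teip_dist(A, B, nu):
    """Distance of Prop. 3.1: norm of A (+) ((-1) (x) B)."""
    return teip_norm(ts_add(A, ts_scale(-1.0, B)), nu)


A = [(1.0, 0.0), (2.0, 0.5)]
B = [(3.0, 0.25), (1.0, 0.5)]
C = [(-1.0, 0.1)]
d_ab = teip_dist(A, B, nu=1.0)
```

Note that the two series compared here have different sampling instants and
that `C` has a different length, which is the situation the Euclidean
distance cannot handle directly.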

3.2. Orthogonalization in TEVS 

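To exemplify the effect of elasticity in TEVS, we give below the result
of the Gram-Schmidt orthogonalization algorithm for two families of
independent time series. The first family is composed of uniformly sampled
time series having increasing lengths. The second family (a sine-cosine
basis) is composed of uniformly sampled time series, all of which have the
same length.

The tests which are described in the next sections were performed on a
set $\mathbb{U}^*$ of discrete time series whose elements are defined on
$(\mathbb{R} - \{0\}) \times [0; 1]$, using the following TEIP, i.e. the TEP
of Def. 2.5 with $\alpha = 1$, $\beta = -1$, $\xi = 0$ and the functions $f$
and $g$ of Prop. 2.2:

\[
\langle A_1^p, B_1^q \rangle_{teip} =
  \langle A_1^{p-1}, B_1^q \rangle_{teip}
  - \langle A_1^{p-1}, B_1^{q-1} \rangle_{teip}
  + e^{-\nu \cdot d(t_{a(p)}, t_{b(q)})} (a(p) \cdot b(q))
  + \langle A_1^p, B_1^{q-1} \rangle_{teip}
\qquad (4)
\]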



3.2.1. Orthogonalization of an independent family of time series with
increasing lengths

The family of time series we are considering is composed of 11 uniformly
sampled time series, whose lengths range from 1 to 11 samples; it is given
in Eq. (5).

Since the zero value cannot be used for the space dimension, we replaced
it by $\varepsilon$, which is the smallest non zero positive real for our
test machine (i.e. $2^{-1074}$). The result of the Gram-Schmidt
orthogonalization process using $\nu = .01$ on this basis is given in Fig. 2.

3.2.2. Orthogonalization of a sine-cosine basis 

An orthonormal family of discrete sine-cosine functions is no longer
orthogonal in a TEVS. The result of the Gram-Schmidt orthogonalization
process using $\nu = .01$ when applied to a discrete sine-cosine basis is
given in Fig. 3, in which only the first 8 components are displayed. The
lengths of the waves are 128 samples.
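
The loss of orthogonality and its repair by the Gram-Schmidt process can be
reproduced on a reduced example. The sketch below is an illustration under
several assumptions of ours: 16 samples instead of 128, only four waves, all
series sharing the same timestamps so that they can be stored as plain value
lists, $d(t,t') = |t - t'|$, and zero-valued samples kept as they are rather
than replaced by $\varepsilon$.

```python
import math


def teip_grid(x, y, t, nu):
    """TEIP of two series sampled on the same timestamps t (double-sum form)."""
    return sum(xi * yj * math.exp(-nu * abs(ti - tj))
               for xi, ti in zip(x, t) for yj, tj in zip(y, t))


def gram_schmidt(vectors, t, nu):
    """Modified Gram-Schmidt process with respect to the TEIP."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = teip_grid(w, u, t, nu)           # component of w along u
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(teip_grid(w, w, t, nu))
        basis.append([wi / norm for wi in w])    # normalize the residual
    return basis


n, nu = 16, 0.01
t = [i / n for i in range(n)]                    # uniform timestamps in [0, 1)
family = ([[math.sin(2 * math.pi * k * ti) for ti in t] for k in (1, 2)]
          + [[math.cos(2 * math.pi * k * ti) for ti in t] for k in (1, 2)])

euclid_01 = sum(x * y for x, y in zip(family[0], family[1]))   # close to 0
elastic_01 = teip_grid(family[0], family[1], t, nu)            # not close to 0
ortho = gram_schmidt(family, t, nu)
```

The two sine waves are orthogonal for the Euclidean inner product but not
for the TEIP, and the vectors returned by `gram_schmidt` are orthonormal for
the TEIP up to round-off.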

\[
\begin{array}{l}
(1, 0) \\
(\varepsilon, 0)(1, 1/10) \\
(\varepsilon, 0)(\varepsilon, 1/10)(1, 2/10) \\
\cdots \\
(\varepsilon, 0)(\varepsilon, 1/10)(\varepsilon, 2/10) \cdots (1, 1)
\end{array}
\qquad (5)
\]

Figure 2: Result of the orthogonalization of the family of time series
defined in Eq. (5) using $\nu = .01$: except for the first spike located at
time 0, each original spike is replaced by two spikes, one negative, the
other positive.

Figure 3: Orthogonalization of the sine-cosine basis using $\nu = .01$: the
waves are slightly deformed jointly in amplitude and in frequency. For
readability of the figure, only the first 8 components are presented.

3.3. Kernel methods in TEVS

A wide range of literature exists on kernels, among which [6], [7] and [8]
present some large syntheses of major results.

Definition 3.1. A kernel on a non empty set $U$ refers to a complex (or real)
valued symmetric function
$\varphi(x, y) : U \times U \to \mathbb{C}$ (or $\mathbb{R}$).

Definition 3.2. Let $U$ be a non empty set. A function
$\varphi : U \times U \to \mathbb{C}$ is called a positive (resp. negative)
definite kernel if and only if it is Hermitian (i.e.
$\varphi(x, y) = \overline{\varphi(y, x)}$, where the overline stands for the
conjugate number) for all $x$ and $y$ in $U$ and
$\sum_{i,j=1}^{n} c_i \bar{c_j} \varphi(x_i, x_j) \ge 0$ (resp.
$\sum_{i,j=1}^{n} c_i \bar{c_j} \varphi(x_i, x_j) \le 0$), for all $n$ in
$\mathbb{N}$, $(x_1, x_2, \ldots, x_n) \in U^n$ and
$(c_1, c_2, \ldots, c_n) \in \mathbb{C}^n$.

Definition 3.3. Let $U$ be a non empty set. A function
$\varphi : U \times U \to \mathbb{C}$ is called a conditionally positive
(resp. conditionally negative) definite kernel if and only if it is Hermitian
(i.e. $\varphi(x, y) = \overline{\varphi(y, x)}$ for all $x$ and $y$ in $U$)
and $\sum_{i,j=1}^{n} c_i \bar{c_j} \varphi(x_i, x_j) \ge 0$ (resp.
$\sum_{i,j=1}^{n} c_i \bar{c_j} \varphi(x_i, x_j) \le 0$), for all
$n \ge 2$ in $\mathbb{N}$, $(x_1, x_2, \ldots, x_n) \in U^n$ and
$(c_1, c_2, \ldots, c_n) \in \mathbb{C}^n$ with $\sum_{i=1}^{n} c_i = 0$.

In the last two above definitions, it is easy to show that it is sufficient to
consider mutually different elements in $U$, i.e. collections of distinct
elements $x_1, x_2, \ldots, x_n$.


Definition 3.4. A positive (resp. negative) definite kernel defined on a finite 
set U is also called a positive (resp. negative) semidefinite matrix. Similarly, 
a positive (resp. negative) conditionally definite kernel defined on a finite set 
is also called a positive (resp. negative) conditionally semidefinite matrix. 

3.3.1. Definiteness of TEIP based kernel

Proposition 3.2. A TEIP is a positive definite kernel.

The proof of Prop. 3.2 is straightforward and is omitted.
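
Prop. 3.2 can be probed numerically on a finite set of time series by
evaluating the quadratic form of Def. 3.2 for a few real coefficient
vectors. This is a sanity check rather than a proof, assuming scalar values,
$g(t,t') = e^{-\nu|t-t'|}$ and the double-sum form of the TEIP; the names
`teip`, `gram_matrix` and `quadratic_form` and the data are ours.

```python
import math


def teip(A, B, nu):
    """TEIP as a double sum of a * b * exp(-nu * |ta - tb|)."""
    return sum(a * b * math.exp(-nu * abs(ta - tb))
               for a, ta in A for b, tb in B)


def gram_matrix(series, nu):
    """Matrix of pairwise TEIP values for a finite set of time series."""
    return [[teip(X, Y, nu) for Y in series] for X in series]


def quadratic_form(K, c):
    """sum_{i,j} c_i * c_j * K[i][j] for real coefficients c."""
    return sum(ci * cj * K[i][j]
               for i, ci in enumerate(c) for j, cj in enumerate(c))


series = [
    [(1.0, 0.0), (2.0, 0.5)],
    [(3.0, 0.25), (1.0, 0.5)],
    [(-1.0, 0.1), (0.5, 0.9), (2.0, 1.0)],
]
K = gram_matrix(series, nu=1.0)
coefficients = [(1.0, -1.0, 0.0), (1.0, 1.0, 1.0),
                (0.3, -2.0, 1.7), (-1.0, 0.5, 0.5)]
forms = [quadratic_form(K, c) for c in coefficients]
```

Each value in `forms` is the squared TEIP norm of the corresponding linear
combination of the three series, hence non negative up to round-off.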

3.3.2. SVM classification using a TEP based kernel 

In [5], $\langle \cdot, \cdot \rangle_{twip2}$ (Def. 2.7) has been tested on a
classification task using an SVM classifier on 20 datasets containing time
series uniformly sampled and having the same lengths inside each dataset. On
the same data, we get similar results for
$\langle \cdot, \cdot \rangle_{teip}$ (Eq. 4) and do not report them in this
paper. The benefit of introducing some time elasticity, controlled using the
parameter $\nu$, is quite clear when comparing the classification error rates
obtained using a Gaussian kernel exploiting the distance derived from
$\langle \cdot, \cdot \rangle_{teip}$ (Prop. 3.1) with the classification
error rates obtained using a Gaussian kernel exploiting the Euclidean
distance.
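
A Gaussian kernel built on the TEIP distance can be sketched as follows. The
bandwidth parameter `sigma`, the normalization $2\sigma^2$ and the function
names are assumptions of ours, not the settings used in the experiments; the
squared distance is expanded by bilinearity as
$\langle A, A \rangle - 2 \langle A, B \rangle + \langle B, B \rangle$, with
scalar values and $g(t,t') = e^{-\nu|t-t'|}$.

```python
import math


def teip(A, B, nu):
    """TEIP as a double sum of a * b * exp(-nu * |ta - tb|)."""
    return sum(a * b * math.exp(-nu * abs(ta - tb))
               for a, ta in A for b, tb in B)


def sq_dist(A, B, nu):
    """Squared TEIP distance, clamped at 0 to absorb round-off."""
    return max(0.0, teip(A, A, nu) - 2.0 * teip(A, B, nu) + teip(B, B, nu))


def gaussian_kernel(A, B, nu, sigma):
    """exp(-delta(A, B)^2 / (2 * sigma^2)) with delta the TEIP distance."""
    return math.exp(-sq_dist(A, B, nu) / (2.0 * sigma ** 2))


A = [(1.0, 0.0), (2.0, 0.5)]
B = [(3.0, 0.25), (1.0, 0.5)]
k_ab = gaussian_kernel(A, B, nu=1.0, sigma=2.0)
```

Such a function can be evaluated on every pair of training series to fill a
precomputed kernel matrix for an SVM; a large $\nu$ brings it close to the
Gaussian kernel on the Euclidean distance when the series share their
timestamps.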

3.4. Elastic Cosine similarity in TEVS, with application to symbolic (e.g. 
textual) information retrieval 

Similarly to the definition of the cosine of two vectors in Euclidean space,
we define the elastic cosine of two sequences by using any TEP that satisfies
the conditions of Theorem 2.1.

Definition 3.5. Given two sequences, $A$ and $B$, the elastic cosine
similarity of these two sequences is given using a time elastic inner product
$\langle X, Y \rangle_e$ and the induced norm
$\|X\|_e = \sqrt{\langle X, X \rangle_e}$ as

\[
similarity = \cos_e(\theta) = \frac{\langle A, B \rangle_e}{\|A\|_e \cdot \|B\|_e}
\]

In the case of textual information retrieval, namely text matching, the
timestamps variable coincides with the index of words in the text, and the
spatial dimensions encode the words in a given dictionary. For instance,
each word can be represented using a vector whose dimension is the size of
the set of concepts (or senses) that cover the conceptual model associated
with the dictionary, and each coordinate selected in $[0; 1]$ encodes the
degree of presence of the concept or sense in the considered word. In that
case, the elastic cosine similarity measure takes values in $[0; 1]$, 0
indicating the lowest possible similarity value between two texts and 1 the
greatest possible similarity value between two texts. The elastic cosine
similarity takes into account the order of occurrence of the words in a text,
which could be an advantage compared to the Euclidean cosine measure that
does not cope with the ordering of words.

Let us consider the following elastic inner product dedicated to text
matching. In the following definition, $A_1^p$ and $B_1^q$ are sequences of
words that represent textual content.

Definition 3.6.

\[
\langle A_1^p, B_1^q \rangle_{teiptm} =
  \langle A_1^{p-1}, B_1^q \rangle_{teiptm}
  - \langle A_1^{p-1}, B_1^{q-1} \rangle_{teiptm}
  + e^{-\nu \cdot d(t_{a(p)}, t_{b(q)})} \cdot \delta(a(p), b(q))
  + \langle A_1^p, B_1^{q-1} \rangle_{teiptm}
\qquad (6)
\]

where $\delta(x, y) = 1$ if $x = y$ ($x$ and $y$ identify the same word), 0
otherwise, $d$ is a distance on the word indexes, and $\nu$ a time stiffness
parameter.

Proposition 3.3. For $\nu = 0$, the elastic inner product defined in Def. 3.6
coincides with the Euclidean inner product between two vectors whose
coordinates correspond to term frequencies observed in the $A_1^p$ and
$B_1^q$ text sequences. If we change the definition of $\delta$ to
$\delta(x, y) = IDF(x)$ if $x = y$, 0 otherwise, where $IDF(x)$ is the
inverse document frequency of term $x$ in the considered collection, then for
$\nu = 0$, $\langle A_1^p, B_1^q \rangle_{teiptm}$ coincides with the
Euclidean inner product between two vectors whose coordinates correspond to
the TF-IDF (term frequency times the inverse document frequency) of terms
occurring in the $A_1^p$ and $B_1^q$ text sequences.

The proof of Proposition 3.3 is straightforward and is omitted.

Thus, the elastic cosine measure derived from the elastic inner product
defined by Def. 3.6 generalizes the cosine measure implemented in the vector
model [9] and commonly used in the text information retrieval community.
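
The behaviour described above can be sketched on two short word sequences.
This is an illustration only: word positions play the role of timestamps, the
time function is assumed to be $e^{-\nu |i - j|}$, and the names `teip_tm`,
`elastic_cosine` and `tf_cosine` as well as the sample sentences are ours.

```python
import math
from collections import Counter


def teip_tm(A, B, nu):
    """Sum of exp(-nu * |i - j|) over all pairs of identical words A[i], B[j]."""
    return sum(math.exp(-nu * abs(i - j))
               for i, wa in enumerate(A)
               for j, wb in enumerate(B) if wa == wb)


def elastic_cosine(A, B, nu):
    """Elastic cosine of Def. 3.5 built on teip_tm."""
    return teip_tm(A, B, nu) / math.sqrt(teip_tm(A, A, nu) * teip_tm(B, B, nu))


def tf_cosine(A, B):
    """Classical cosine between term-frequency vectors."""
    ca, cb = Counter(A), Counter(B)
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb)


A = "the cat sat on the mat".split()
B = "the mat sat on the cat".split()

same_bag = elastic_cosine(A, B, nu=0.0)      # word order is ignored
order_aware = elastic_cosine(A, B, nu=0.5)   # swapped words are penalized
```

The two sentences contain the same words, so for $\nu = 0$ the elastic cosine
equals the term-frequency cosine, i.e. 1. For $\nu = 0.5$ the swapped words
"cat" and "mat" only match at a distance of four positions and the similarity
drops below 1.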




4. Conclusion

This paper proposed what we call a family of time elastic inner products
able to cope with non-uniformly sampled time series of various lengths, as
long as they do not contain the zero value. These constructions allow one to
embed any such time series in a single vector space, which generalizes the
notion of Euclidean vector space. The recursive structure of the
construction offers the possibility to manage several time elastic
dimensions. Some applicative benefits could be expected in time series
analysis when time elasticity is an issue, for instance in the field of
numeric or symbolic sequence data mining.

References 

[1] V. M. Velichko, N. G. Zagoruyko, International Journal of Man-Machine 
Studies 2 (1970) 223-234. 

[2] H. Sakoe, S. Chiba, in: Proceedings of the 7th International Congress of 
Acoustic, pp. 65-68. 

[3] L. Chen, R. Ng, in: Proceedings of the 30th International Conference on 
Very Large Data Bases, pp. 792-801. 

[4] P. F. Marteau, IEEE Trans. Pattern Anal. Mach. Intell. 31 (2009) 306-
318.

[5] P.-F. Marteau, S. Gibet, CoRR abs/1005.5141 (2010).

[6] C. Berg, J. P. R. Christensen, P. Ressel, Harmonic Analysis on Semi-
groups: Theory of Positive Definite and Related Functions, volume 100
of Graduate Texts in Mathematics, Springer-Verlag, New York, 1984.

[7] B. Schölkopf, A. J. Smola, Learning with Kernels: Support Vector Ma-
chines, Regularization, Optimization, and Beyond, MIT Press, Cam-
bridge, MA, USA, 2001.

[8] J. Shawe-Taylor, N. Cristianini, Kernel Methods for Pattern Analysis,
Cambridge University Press, New York, NY, USA, 2004.

[9] G. Salton, M. McGill, Introduction to Modern Information Retrieval, 
McGraw-Hill Book Company, 1984. 






